library(htmltools)

APP_URL <- "https://pchen-aitutor.hf.space"  # ← your Gradio app URL
USER_ID <- "anonymous_user"

browsable(tagList(
  # --- styles (fab, bubble, modal) ---
  tags$style(HTML("
    #tutor-fab{position:fixed;right:20px;bottom:20px;z-index:99999;padding:12px 14px;border-radius:999px;border:1px solid #ddd;background:#fff;box-shadow:0 8px 24px rgba(0,0,0,.15);cursor:pointer;font-weight:600}
    #tutor-fab:hover{background:#f6f8ff}
    #tutor-sel-btn{position:absolute;display:none;z-index:100000;padding:6px 10px;font-size:12px;border:1px solid #e5e7eb;border-radius:10px;background:#111827;color:#f9fafb;box-shadow:0 8px 24px rgba(0,0,0,.2);cursor:pointer;user-select:none}
    #tutor-modal-overlay{display:none;position:fixed;z-index:99998;inset:0;background:rgba(0,0,0,.35)}
    #tutor-modal{position:absolute;top:5%;left:50%;transform:translateX(-50%);width:min(980px,95vw);height:min(720px,90vh);background:#fff;border-radius:10px;box-shadow:0 10px 30px rgba(0,0,0,.25);overflow:hidden}
    #tutor-modal-header{display:flex;justify-content:space-between;align-items:center;padding:10px 14px;background:#f5f5f7;border-bottom:1px solid #e5e5ea;font-weight:600}
    #tutor-close-btn{cursor:pointer;border:none;background:transparent;font-size:18px}
    #tutor-iframe{width:100%;height:calc(100% - 44px);border:0}
  ")),

  # modal + iframe
  tags$div(id="tutor-modal-overlay",
    tags$div(id="tutor-modal",
      tags$div(id="tutor-modal-header","AI Calculus Tutor", tags$button(id="tutor-close-btn","✕")),
      tags$iframe(id="tutor-iframe", src="")
    )
  ),

  # floating button + selection bubble
  tags$button(id="tutor-fab","Ask Tutor"),
  tags$button(id="tutor-sel-btn","Ask Tutor"),

  # script
  tags$script(HTML(sprintf("
    (function(){
      const APP_URL = '%s';
      const USER_ID = '%s';

      function sel(){ try { return (window.getSelection?window.getSelection().toString():'').trim(); } catch(e){ return ''; } }
      function openInline(q, action){
        const p = new URLSearchParams({ user_id: USER_ID, q: q||'', action: action||'prefill' });
        const frame = document.getElementById('tutor-iframe');
        frame.src = APP_URL + '/?' + p.toString();          // Prefill handled by app from URL
        document.getElementById('tutor-modal-overlay').style.display='block';
      }
      function openTab(q, action){
        const p = new URLSearchParams({ user_id: USER_ID, q: q||'', action: action||'prefill' });
        window.open(APP_URL + '/?' + p.toString(), 'ai_tutor_tab', 'noopener');
      }

      // floating button: prefill by default; Alt=send; Ctrl/Cmd=new tab
      document.getElementById('tutor-fab').addEventListener('click', function(ev){
        const q = sel();
        const act = ev.altKey ? 'send' : 'prefill';
        if (ev.ctrlKey || ev.metaKey) openTab(q, act); else openInline(q, act);
      });

      // selection bubble near highlighted text
      const b = document.getElementById('tutor-sel-btn');
      let last='';
      function place(x,y){ b.style.left=(x+10)+'px'; b.style.top=(y+10)+'px'; }
      function show(){ b.style.display='block'; } function hide(){ b.style.display='none'; }
      document.addEventListener('mouseup', e => setTimeout(() => {
        const t = sel(); if (t && t.length>=2 && t.length<=800){ last=t; place(e.pageX,e.pageY); show(); } else hide();
      }, 20));
      window.addEventListener('scroll', hide, {passive:true}); window.addEventListener('resize', hide);

      // bubble click: prefill by default; Alt=send; Ctrl/Cmd=new tab
      b.addEventListener('click', function(ev){
        if (!last) return;
        const act = ev.altKey ? 'send' : 'prefill';
        if (ev.ctrlKey || ev.metaKey) openTab(last, act); else openInline(last, act);
        hide();
      });

      // close modal
      const overlay = document.getElementById('tutor-modal-overlay');
      document.getElementById('tutor-close-btn').addEventListener('click', () => overlay.style.display='none');
      overlay.addEventListener('click', e => { if (e.target===overlay) overlay.style.display='none'; });
    })();
  ", APP_URL, USER_ID)))
))
AI Calculus Tutor

Learning Objectives

By the end of this section, students will be able to:

  1. Understand first and second-order conditions for unconstrained optimization problems
  2. Use Lagrange multipliers to solve equality-constrained optimization problems
  3. Understand KKT conditions for inequality-constrained optimization
  4. Solve business optimization problems using analytic and computational methods

R skills

  1. Solve univariable unconstrained optimisation problems using optimize()
  2. Solve multivariable unconstrained optimisation problems using optim()
  3. Solve multivariable constrained optimisation problems using solnp()

Contents

  • Unconstrained optimization
    • Univariable cases
    • Multivariable cases
  • Optimization with equality constraints
    • Lagrange method
  • Optimization with inequality constraints
    • KKT conditions

Knowledge Point 1: Unconstrained Optimization

Unconstrained optimization seeks to find the maximum or minimum of a function without any restrictions on the variables. This forms the foundation for understanding more complex constrained optimization problems.

The following interactive diagram lets you guess the location of the top or the bottom of a function. A secant line is used to predict how the function proceeds.

The objective is to stop right at the top or the bottom of the function. You can choose different functions from the drop-down menu to run the animation.

The animation shows:

  • The distance between the secant points matters: the closer they are, the better the secant predicts how the function proceeds. This is why we use the tangent instead of the secant.
  • At a bottom or a top the tangent is flat. But a flat tangent does not always imply a bottom or a top: a flat tangent \(f'=0\) is a necessary condition but not a sufficient condition.
  • A change of the tangent from downward sloping to upward sloping gives a minimum, while a change from upward sloping to downward sloping gives a maximum. This first-order test determines whether a critical point is a maximum or a minimum.

1.1 Univariable Cases

Formulation of the Problem

For a function \(f: \mathbb{R}\to \mathbb{R}\), we want to find the minimum or the maximum of the function \(f(x).\) Since a maximisation problem can always be formulated as a minimisation problem of its negation, \(\max_{x\in\mathbb{R}}f(x) \Rightarrow \min_{x\in\mathbb{R}}-f(x)\), we will focus on minimisation problems.

\[\min_{x \in \mathbb{R}} f(x)\]

First-Order Condition: For a function \(f(x)\) to have a local extremum at \(x = a\), it is necessary that: \[f'(a) = 0\]

Second-Order Condition: For a critical point \(x = a\) where \(f'(a) = 0\):

  • If \(f''(a) > 0\), then \(x = a\) is a local minimum
  • If \(f''(a) < 0\), then \(x = a\) is a local maximum
  • If \(f''(a) = 0\), the test is inconclusive
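Both tests can be carried out with R's symbolic derivative function D(). A minimal sketch, using an assumed example function \(f(x)=x^3-3x\) (not from the text):

```r
# Hypothetical example: f(x) = x^3 - 3x has critical points at x = -1 and x = 1
f_expr <- expression(x^3 - 3*x)
f1 <- D(f_expr, "x")   # symbolic first derivative:  3*x^2 - 3
f2 <- D(f1, "x")       # symbolic second derivative: 6*x

x <- 1
c(eval(f1), eval(f2))  # f'(1) = 0, f''(1) = 6 > 0   -> local minimum

x <- -1
c(eval(f1), eval(f2))  # f'(-1) = 0, f''(-1) = -6 < 0 -> local maximum
```

The same pattern works for any expression built from functions that D() can differentiate.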

Example 5.1: TechStart Pricing Strategy

TechStart Inc. wants to maximize profit from their new software product. Market research shows demand follows \(D(p) = 1000 - 10p\) and production costs are \(C(q) = 5000 + 20q\). The profit function is:

\[\pi(p) = p \cdot D(p) - C(D(p)) = p(1000 - 10p) - (5000 + 20(1000 - 10p))\] \[\pi(p) = 1000p - 10p^2 - 25000 + 200p = 1200p - 10p^2 - 25000\]

Analytic Solution

Step 1: First-Order Condition \[\pi'(p) = 1200 - 20p = 0\] \[p^* = 60\]

Step 2: Second-Order Condition \[\pi''(p) = -20 < 0\]

Since \(\pi''(60) < 0\), \(p = 60\) gives a local maximum profit. Since the quadratic function has only one critical point, it is also the global maximum.

Step 3: Calculate Maximum Profit \[\pi(60) = 1200(60) - 10(60)^2 - 25000 = 72000 - 36000 - 25000 = 11000\]

Numerical R Solution

profit <- function(p) {
  1200*p - 10*p^2 - 25000
}

optimize(profit, interval = c(0, 200), maximum = TRUE)
## $maximum
## [1] 60
## 
## $objective
## [1] 11000

Exercise

Nonlinear Univariable Optimization Problem

A firm sells a premium product. Market demand decreases nonlinearly with price: \[q(p) = 1500\, e^{-0.04p},\] where \(p \ge 0\) is the price (in dollars) and \(q(p)\) is quantity demanded.

The firm’s total cost consists of a fixed overhead and a per-unit cost proportional to output: \[C(p) = 3000 \;+\; 18\, q(p).\] Revenue at price \(p\) is \(R(p) = p \cdot q(p).\)

Find the profit maximising quantity \(q\).

Complete the task using optimize() in the following R environment or in your local RStudio.
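One possible way to set the computation up (a sketch only; the function names are arbitrary choices, and you should confirm the result analytically):

```r
# Demand and profit as functions of price p: revenue minus cost
q <- function(p) 1500 * exp(-0.04 * p)
profit <- function(p) p * q(p) - (3000 + 18 * q(p))

opt <- optimize(profit, interval = c(0, 200), maximum = TRUE)
p_star <- opt$maximum   # optimal price; analytically p* = 18 + 1/0.04 = 43
q_star <- q(p_star)     # profit-maximising quantity, approx. 268.6
c(p = p_star, q = q_star)
```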

1.2 Multivariable Cases

Formulation of the Problem

For a multivariable function \(f:\mathbb{R}^n \to \mathbb{R}\) we want to find the minimum.

\[\min_{x \in \mathbb{R}^n} f(x)\]

First-Order Condition: For a function \(f:\mathbb{R}^n \to \mathbb{R}\) to have a local extremum at \(x = x_0\), it is necessary that: \[\nabla f(x_0) = 0\]

Second-Order Condition: For a critical point \(x = x_0\) where \(\nabla f(x_0) = 0\): \[H = \nabla^2 f(x_0)\]

  • If \(H\) is positive definite the critical point \(x_0\) is a minimum
  • If \(H\) is negative definite the critical point \(x_0\) is a maximum
  • If \(H\) is indefinite the critical point \(x_0\) is a saddle point
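In practice the definiteness check is easiest via eigenvalues. A small helper function (an illustration, not from the text) makes the three cases explicit:

```r
# Classify a critical point from its symmetric Hessian via eigenvalue signs
classify_critical <- function(H, tol = 1e-8) {
  ev <- eigen(H, symmetric = TRUE, only.values = TRUE)$values
  if (all(ev > tol))                        "local minimum"   # positive definite
  else if (all(ev < -tol))                  "local maximum"   # negative definite
  else if (any(ev > tol) && any(ev < -tol)) "saddle point"    # indefinite
  else                                      "inconclusive"    # semidefinite
}

classify_critical(diag(c(2, 3)))                  # "local minimum"
classify_critical(matrix(c(1, 0, 0, -1), 2, 2))   # "saddle point"
```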

For \(n=2\), i.e. \(f\) is a function of two variables \(f(x,y)\) we have

\[\nabla f = \begin{pmatrix} f_{x} \\ f_{y} \end{pmatrix}\]

The second-order derivatives define the Hessian matrix: \[H = \begin{pmatrix} f_{xx} & f_{xy} \\ f_{yx} & f_{yy} \end{pmatrix}\]

Example 5.2

Sarah owns a coffee shop that sells two main products: coffee and pastries. Her daily revenue depends on both the number of cups of coffee sold \(x\) and the number of pastries sold \(y\). Through market research, she has determined that her revenue function is

\[R(x, y) = 120x + 100y - 2x^2 - y^2 - xy\]

How can Sarah maximise her revenue?

Analytic Solution

Step 1: Find Critical Points by FOC

  • \(\nabla f =\begin{pmatrix} \frac{\partial R}{\partial x} \\ \frac{\partial R}{\partial y}\end{pmatrix}=\begin{pmatrix} 120-4x-y \\100 -2y -x \end{pmatrix}=\begin{pmatrix} 0 \\ 0\end{pmatrix}\)

  • Rearrange in matrix form \[\begin{pmatrix} 4 & 1 \\ 1 & 2 \end{pmatrix}\begin{pmatrix} x \\y\end{pmatrix}=\begin{pmatrix} 120 \\100\end{pmatrix} \]

  • Solve the two equation system

A <- matrix(c(4,1,1,2),2,2,byrow=TRUE)
b = c(120,100)
solve(A)%*%b
##      [,1]
## [1,]   20
## [2,]   40

Step 2: Compute eigenvalues of the Hessian matrix

  • \(\frac{\partial^2 R}{\partial x^2} = -4\)
  • \(\frac{\partial^2 R}{\partial y^2} = -2\)
  • \(\frac{\partial^2 R}{\partial x\partial y} = -1\)

\[H = \begin{pmatrix} -4 & -1 \\ -1 & -2 \end{pmatrix}\]

H<- matrix(c(-4,-1,-1,-2),2,2,byrow=TRUE)
eigen(H)$values
## [1] -1.585786 -4.414214

Step 3: Apply Second-Order Test

Since both eigenvalues are negative, the Hessian matrix \(H\) is negative definite, so the critical point (20, 40) is a local maximum. Since there is only one critical point, the local maximum is also the global maximum.

Step 4: Calculate Maximum Revenue \[R(20, 40) = 120(20) + 100(40) - 2(20)^2 - (40)^2 - (20)(40) = 3200\]

Numerical R Solution

# Define the negated revenue function (because optim minimizes by default)
revenue_neg <- function(v) {
  x <- v[1]
  y <- v[2]
  - (120*x + 100*y - 2*x^2 - y^2 - x*y)  # negative for maximization
}

# Use optim to maximize revenue
optim(par = c(0, 0), fn = revenue_neg)
## $par
## [1] 20 40
## 
## $value
## [1] -3200
## 
## $counts
## function gradient 
##      163       NA 
## 
## $convergence
## [1] 0
## 
## $message
## NULL

Graphic Illustration of the critical point

library(plotly)

A<- matrix(c(-4,-1,-1,-2),2,2,byrow=TRUE)

# Create a grid over [-0.3, 0.3]
x <- seq(-0.3, 0.3, 0.01)
y <- seq(-0.3, 0.3, 0.01)

z <- outer(x, y, function(x, y) {
  mapply(function(x, y) {
    X <- c(x, y)
    t(X) %*% A %*% X
  }, x, y)
})



plot_ly(x = x, y = y, z = z, type = "surface") |>
  layout(title = "Interactive Surface: z = xᵀAx",
         scene = list(
           xaxis = list(title = "dx"),
           yaxis = list(title = "dy"),
           zaxis = list(title = "dz")))

Exercise

A company produces three products: \(x\), \(y\), and \(z\).
The total revenue (in thousands of dollars) is modeled by: \[R(x,y,z) = 40x+50y+60z-2x^2-2.5y^2-3z^2\] where \(x, y, z \ge 0\) are the quantities of the three products (in hundreds of units).

Task:
Find the production levels \((x, y, z)\) that maximize revenue.

Instructions:

  1. Write the gradient vector \(\nabla R(x,y,z)\) by computing the partial derivatives with respect to \(x\), \(y\), and \(z\).

  2. Set \(\nabla R(x,y,z) = 0\) and solve for the critical point(s).

  3. Form the Hessian matrix and evaluate it at each critical point.

  4. Use the second-order test for multivariable functions to determine whether each point is a local maximum, local minimum, or saddle point.

  5. Interpret the optimal production quantities and the maximum revenue.

Solve the problem numerically with optim() to confirm your analytic solution. You can complete the task in the following R environment or in your local RStudio.

Remarks: You can study the options of optim() by typing ?optim in the R console, for example to increase the precision of the solution.
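A numerical cross-check for this exercise, assuming the revenue function's linear term in \(z\) reads \(60z\) (a sketch only; starting values are arbitrary):

```r
# Negated revenue (optim() minimises); assumes R(x,y,z) = 40x + 50y + 60z - 2x^2 - 2.5y^2 - 3z^2
negR <- function(v) -(40*v[1] + 50*v[2] + 60*v[3] - 2*v[1]^2 - 2.5*v[2]^2 - 3*v[3]^2)

fit <- optim(par = c(1, 1, 1), fn = negR, method = "BFGS")
fit$par      # critical point; analytically (10, 10, 10)
-fit$value   # maximum revenue; analytically 750
```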

Interactive Quiz 5.1


Knowledge Point 2: Equality-Constrained Optimization (Lagrange Multipliers)

Example 5.3

GreenInvest, an ESG-focused investment firm, faces a portfolio optimization problem involving the allocation of $100 million across three green tech companies: SolarTech, EcoTransport, and CleanWater. Each offers different returns and risks, requiring the firm to maximize expected returns while strictly adhering to the constraint that total investment equals exactly $100 million.

This constraint transforms the problem from unconstrained optimization to a constrained optimization task. The Lagrange multiplier method is the standard method used to solve constrained optimization problems.

Now we want to motivate the Lagrange multiplier method. Constrained optimization can be seen as the problem of finding the highest point along a mountain road, where the height of the mountain is the value of the objective. We want to go as high as possible. The road represents the restriction: we are not allowed to leave the road.

When a road rises higher, it cuts the contour lines

The contour lines of the mountain connect points of equal height. Whenever the road crosses these contour lines, it rises higher and higher.

At the highest point the road does not cut the contour lines?

When the road becomes tangent to a contour line and does not cut any higher contour line, it has reached the highest point: on both sides of the tangent point the road moves away from the contour line, which means the road goes downwards in both directions, so the tangent point is the highest point on the road.

Let us use a general contour plot with a reference line to formalize the idea described above. As the red line moves from the lower right corner toward the upper left, it crosses contour lines with increasing values, meaning the surface rises along the line. At point A, it reaches the highest point with a value of 10. After that, as it crosses contour lines with decreasing values, the surface descends.

Now if we cut a slice at z = 10, the contour line of \(f\) (the yellow ellipse) and the contour of \(g\) (the red line) touch each other at A; they share the same tangent line. Since the gradients \(\nabla f\) and \(\nabla g\) are both perpendicular to the shared tangent line, they must be parallel to each other. This leads to

\[\nabla f = \lambda \nabla g,\] where \(\lambda\) is a scalar. Rearranging the equation above, we have: \[\nabla f - \lambda \nabla g = 0.\] This can be seen as the FOC applied to the Lagrange function: \[ \mathcal L(x,\lambda) = f(x)-\lambda g(x).\] The FOC of the Lagrange function, \(\nabla \mathcal L(x,\lambda) = 0\), leads to \[ \nabla f(x) - \lambda \nabla g(x)=0, \hspace{1cm} g(x)=0.\]

The following 3D diagram shows the objective surface \(f(x,y)\), the constraint plane \(g(x,y)\), the contour lines of \(f\) and \(g\) and the shared tangent, as well as the gradients of \(f\) and \(g\) at the tangent point.

The argument above holds also for nonlinear constraints. (see the following graph)

Now we are in a position to formulate the Lagrange approach for solving constrained optimisation problems.

Formal Formulation of the Problem

For a multivariable function \(f:\mathbb{R}^n \to \mathbb{R}\) we want to find the minimum under \(m\) constraints \(g_i(x)=0\) for \(i = 1,2,...,m\), where each constraint is a multivariable function \(g_i:\mathbb{R}^n \to \mathbb{R}\).

\[\min_{x \in \mathbb{R}^n} f(x)\] \[\text{s.t.} \hspace{0.5cm} g_i(x) = 0, \hspace{0.5cm} i = 1, \dots, m\]

Definition 2.1 (Lagrange Function)

\[\mathcal{L}(x, \lambda) = f(x) + \sum_{i=1}^m \lambda_i g_i(x)\] The multivariable function \(\mathcal{L}(x, \lambda)\) is called the Lagrange function.

Definition 2.2 (Lagrange First-Order Condition): For the function \(f:\mathbb{R}^n \to \mathbb{R}\) to have a local extremum at \(x = x_0\) under the constraints \(g_i(x_0)=0\), it is necessary that:

\[ \nabla_x \mathcal{L}(x, \lambda) = \nabla f(x) + \sum_{i=1}^m \lambda_i \nabla g_i(x) = 0, \quad g_i(x) = 0 \quad \text{for all } i\]

Definition 2.3 (Lagrange Second-Order Condition): A critical point \(x = x^*\) is a local minimum if the following condition holds:

For all nonzero vectors \(d \in \mathbb{R}^n\) such that

\[\nabla g_i(x^*)^\top d = 0 \quad \text{for all } i = 1, \dots, m,\]

\[d^\top \nabla^2_{xx} \mathcal{L}(x^*, \lambda) d > 0.\] The Lagrange second-order condition is less restrictive than the second-order condition in the unconstrained case, where positive definiteness of the Hessian requires the inequality above to hold for every \(d\). The Lagrange SOC requires it only for those \(d\) along the constraint set (i.e., perpendicular to the gradients of the constraints).

The above is a general formulation of optimisation with \(m\) constraints. Often \(m=1\), i.e., there is only one constraint.

Example 5.4: Budget Allocation

A firm wants to allocate its $100000 advertising budget between two channels:

  • Online advertising: \(x\) (in dollars)
  • TV commercials: \(y\) (in dollars)

The estimated revenue generated from these investments is: \[ R(x, y) = 200\sqrt{x} + 300\sqrt{y} \] The firm must satisfy the total budget constraint: \[ x + y = 100{,}000 \]

We formulate this business problem as a maximisation problem subject to the equality constraint: \[ \max_{x, y} \quad R(x, y) = 200\sqrt{x} + 300\sqrt{y} \] \[ \text{subject to } x + y = 100000 \]

Step 1 We form the Lagrangian: \[ \mathcal{L}(x, y, \lambda) = 200\sqrt{x} + 300\sqrt{y} - \lambda(x + y - 100{,}000) \]

Step 2 First-Order Conditions (FOCs)

\[\begin{align*} \frac{\partial \mathcal{L}}{\partial x} &= \frac{100}{\sqrt{x}} - \lambda = 0 \quad (1) \\ \frac{\partial \mathcal{L}}{\partial y} &= \frac{150}{\sqrt{y}} - \lambda = 0 \quad (2) \\ \frac{\partial \mathcal{L}}{\partial \lambda} &= x + y - 100{,}000 = 0 \quad (3) \end{align*}\]

We have three unknowns \(x,y,\lambda\) and three equations. From equations (1) and (2): \[ \frac{100}{\sqrt{x}} = \frac{150}{\sqrt{y}} \Rightarrow \frac{\sqrt{y}}{\sqrt{x}} = \frac{3}{2} \Rightarrow \frac{y}{x} = \left(\frac{3}{2}\right)^2 = \frac{9}{4} \Rightarrow y = \frac{9}{4}x \]

Substitute into the budget constraint: \[ x + \frac{9}{4}x = 100{,}000 \Rightarrow \frac{13}{4}x = 100{,}000 \Rightarrow x = \frac{400{,}000}{13} \approx 30{,}769.23 \] \[ y = \frac{9}{4}x = \frac{900{,}000}{13} \approx 69{,}230.77 \]

Step 3 Second-Order Condition (SOC)

We verify the SOC on the tangent space of the constraint. Note that the constraint is linear, and the second derivative of a linear function is zero. We have \[\nabla^2_{x,y} \mathcal{L}(x,y,\lambda) = \nabla^2 R(x,y)-\lambda\,\nabla^2 (x+y-100000)=\nabla^2 R(x,y).\] We therefore only need the Hessian of \(R(x,y)\) for the SOC.

The second derivatives are: \[ \frac{\partial^2 R}{\partial x^2} = -50x^{-3/2} < 0, \quad \frac{\partial^2 R}{\partial y^2} = -75y^{-3/2} < 0 \]

Hessian matrix: \[ H = \begin{bmatrix} -50x^{-3/2} & 0 \\ 0 & -75y^{-3/2} \end{bmatrix} \]

Constraint gradient: \[ \nabla g = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \]

Tangent-space vectors are multiples of \[ d = \begin{bmatrix} 1 \\ -1 \end{bmatrix}. \] Note that to determine whether a matrix is negative definite, we would have to consider arbitrary \(d\); on the tangent space we only need this specific direction: equal components with opposite signs.

Evaluate the SOC: \[ d^\top \nabla^2_{x,y}\mathcal{L}\, d = d^\top H d = -50x^{-3/2} - 75y^{-3/2} < 0 \]

Therefore, the Hessian is negative definite on the tangent space \(\Rightarrow\) local maximum.
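This tangent-space check can be replicated numerically at the analytic optimum from Step 2. A quick sketch, using the second derivatives \(R_{xx} = -50x^{-3/2}\), \(R_{yy} = -75y^{-3/2}\), \(R_{xy} = 0\):

```r
# Second derivatives of R(x,y) = 200*sqrt(x) + 300*sqrt(y)
x <- 400000 / 13        # analytic optimum x*
y <- 900000 / 13        # analytic optimum y*
H <- diag(c(-50 * x^(-3/2), -75 * y^(-3/2)))

d <- c(1, -1)           # tangent direction of the constraint x + y = 100000
drop(t(d) %*% H %*% d)  # negative -> local maximum on the tangent space
```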

Determine the maximum revenue

\[R(x^*,y^*)=200\sqrt{30769.23}+300\sqrt{69230.77}=114017.5\]

Numerical R solution

library(Rsolnp)

# Objective function to maximize — note: Rsolnp minimizes by default, so we negate it
fn <- function(par) {
  x <- par[1]
  y <- par[2]
  return(-(200 * sqrt(x) + 300 * sqrt(y)))  # negative for maximization
}

# Equality constraint function: x + y = 100000
eqfun <- function(par) {
  return(sum(par))
}

# Equality bound
eqB <- 100000

# Initial guess
start <- c(50000, 50000)

# Solve with Rsolnp
res <- solnp(pars = start,
             fun = fn,
             eqfun = eqfun,
             eqB = eqB,
             LB = c(0, 0))  # lower bounds for x and y
## 
## Iter: 1 fn: -114017.5423  Pars:  30763.47320 69236.52680
## Iter: 2 fn: -114017.5423  Pars:  30763.47320 69236.52680
## solnp--> Completed in 2 iterations
# Extract solution
opt_x <- res$pars[1]
opt_y <- res$pars[2]
opt_value <- -res$values[length(res$values)]  # reverse the negation
#cat("Optimal x:", opt_x, "\n")
#cat("Optimal y:", opt_y, "\n")
#cat("Maximum Revenue:", opt_value, "\n")

Interactive Visualization

Hover your mouse over the solution point, and you can see that the constraint plane is tangent to the contour line projected onto the (x, y) plane.

Interactive Quiz 5.2


Knowledge Point 3: Optimisation with Equality and Inequality Constraints (KKT Conditions)

Problem Formulation

Definition 3.1 (KKT Conditions): For the problem: \[\min f(x) \quad \text{subject to} \quad g_i(x) = 0, \quad h_j(x) \leq 0\]

The Karush-Kuhn-Tucker conditions are:

  1. Stationarity: \(\nabla f(x^*) + \sum_i \lambda_i \nabla g_i(x^*) + \sum_j \mu_j \nabla h_j(x^*) = 0\)
  2. Primal feasibility: \(g_i(x^*) = 0\), \(h_j(x^*) \leq 0\)
  3. Dual feasibility: \(\mu_j \geq 0\)
  4. Complementary slackness: \(\mu_j h_j(x^*) = 0\)

Comments: At the critical point, we can classify the inequality constraints into two groups: the binding group and the non-binding group. For the non-binding group, the inequality constraints do not play any role at the critical point; hence, they drop out of the Lagrangian function by setting the corresponding \(\mu_j = 0\). For the binding group, the inequality constraints become equality constraints, and we solve the problem as if only equality constraints were present, using the Lagrangian approach.

This is reflected in the Complementary slackness \(\mu_j h_j(x^*) = 0\).

Economic Interpretation: The multiplier \(\mu_j\) represents the shadow price of constraint \(j\). If \(\mu_j > 0\), the constraint is binding (active). If \(\mu_j = 0\), the constraint is not binding.

Example 5.5: Media plan under cuts and reach compliance

\[ \min_{x,y}\; f(x,y)=(x-5)^2+(y-5)^2 \]

Constraints.
\[ x+y=3, \] \[ x+2y\ge 5. \]

First-order conditions (KKT). Write the constraints in standard form: \(g(x,y)=x+y-3=0\) (equality) and \(h(x,y)=5-(x+2y)\le 0\) (inequality).

Lagrangian: \[ \mathcal L(x,y,\lambda,\mu)=(x-5)^2+(y-5)^2+\lambda\,g(x,y)+\mu\,h(x,y),\quad \mu\ge 0. \]

Stationarity: \[ \partial_x\mathcal L=2(x-5)+\lambda-\mu=0,\qquad \partial_y\mathcal L=2(y-5)+\lambda-2\mu=0. \] Feasibility: \(x+y=3,\; x+2y\ge 5\).

Dual feas./slackness: \(\mu\ge 0,\; \mu\,h(x,y)=0\).

Because the projection of the unconstrained minimum \((5,5)\) onto \(x+y=3\) is \((1.5,1.5)\) and violates \(x+2y\ge 5\), the inequality is active at the optimum: \(x+2y=5\). Solving \[ x+y=3,\quad x+2y=5 \;\Rightarrow\; (x^*,y^*)=(1,2). \] Multipliers from stationarity: \[ \lambda-\mu=8,\quad \lambda-2\mu=6 \;\Rightarrow\; (\lambda^*,\mu^*)=(10,2). \]

Second-order condition (SOC)

\(\nabla^2 f=2I\succ 0\) (strictly convex objective) and the constraints are affine, so the KKT point is the unique global minimizer.

Managerial interpretation

  • Constrained optimum: reached at (1, 2), the closest feasible point to the unconstrained minimum at (5, 5).
  • Shadow prices: \(\lambda^*=10\), \(\mu^*=2\). This implies:
    • Raising the right-hand side of the equality constraint by 1 (from 3 to 4) would reduce the objective (the squared deviation) by approximately 10.
    • Relaxing the inequality constraint by 1 (from 5 to 4) would reduce the objective by approximately 2.

Numerical R Solution

# install.packages("Rsolnp") # if needed
library(Rsolnp)

# Objective
fn <- function(w){
  x <- w[1]; y <- w[2]
  (x - 5)^2 + (y - 5)^2
}

# Equality: x + y = 3
eqf <- function(w){
  w[1] + w[2]
}

# Inequality: x + 2y >= 5  -> return x+2y with lower bound 5
ineqf <- function(w){
  w[1] + 2*w[2]
}

ini <- c(0, 0)

res <- solnp(
  pars   = ini,
  fun    = fn,
  eqfun  = eqf,   eqB = 3,
  ineqfun = ineqf, ineqLB = 5, ineqUB = 1e6,
  LB = c(-1e6, -1e6), UB = c(1e6, 1e6)
)
## 
## Iter: 1 fn: 25.0000   Pars:  1.00000 2.00000
## Iter: 2 fn: 25.0000   Pars:  1.00000 2.00000
## solnp--> Completed in 2 iterations
res$pars      # should be ~ c(1, 2)
## [1] 1 2
res$values
## [1] 50 25 25
## --- KKT multipliers from stationarity ---
x <- res$pars[1]; y <- res$pars[2]

# Stationarity system:
# 2(x-5) + lambda - mu = 0
# 2(y-5) + lambda - 2*mu = 0
A <- rbind(c(1, -1), c(1, -2))
b <- c(-2*(x - 5), -2*(y - 5))
sol <- solve(A, b)
lambda <- sol[1]; mu <- sol[2]

list(x = x, y = y, lambda = lambda, mu = mu,
     eq_ok = abs(x + y - 3) < 1e-8,
     ineq_val = x + 2*y, ineq_active = abs(x + 2*y - 5) < 1e-8)
## $x
## [1] 1
## 
## $y
## [1] 2
## 
## $lambda
## [1] 10
## 
## $mu
## [1] 2
## 
## $eq_ok
## [1] TRUE
## 
## $ineq_val
## [1] 5
## 
## $ineq_active
## [1] FALSE

Note: ineq_active returns FALSE only because of floating-point rounding in res$pars; the printed ineq_val shows \(x + 2y = 5\), so the inequality constraint is in fact binding.

Interactive Visualization

The interactive graph below shows the objective surface with its unconstrained minimum marked by the yellow point. The green plane represents the equality constraint, and the dark violet triangular prism represents the inequality constraint. The solution point is located where the objective surface, the green plane, and the boundary of the violet prism all meet. Here, the inequality constraint is binding.

The following contour plot, with the equality constraint (the yellow line), the inequality constraint (the light yellow area), and the solution point, demonstrates an alternative view of the problem.

Exercise

A manufacturer produces two standard products, \(x\) and \(y\), and one premium product, \(z\).
The profit function (in thousands of dollars) is given by: \[ \pi(x,y,z) = 40x^{0.5}y^{0.3}z^{0.2} - 0.4x^2 - 0.3y^2 - 0.5z^2 \] where \(x, y, z \ge 0\) are production quantities (in hundreds of units).

The firm faces the following constraints:

  1. Resource constraint (equality)
    The total labor used by all products must equal 2400 hours: \[ 4x + 6y + 8z = 2400 \] where the coefficients represent labor hours per hundred units.

  2. Market constraint (inequality)
    The premium product’s output cannot exceed 60% of the total production: \[ z \le 0.6(x + y + z) \]

Task:

  1. Write down the Lagrangian function for the problem, introducing a Lagrange multiplier \(\lambda\) for the equality constraint and \(\mu\) for the inequality constraint.
  2. State the Karush–Kuhn–Tucker (KKT) conditions.
  3. Solve the first-order conditions along with the constraints to find the optimal \((x^*, y^*, z^*)\), the associated multipliers, and the maximum profit.
  4. Interpret the economic meaning of the Lagrange multipliers \(\lambda\) and \(\mu\).

Hint: Remember that for the inequality constraint \(g(x,y,z) \le 0\), the KKT complementary slackness condition requires \(\mu \cdot g(x,y,z) = 0\) and \(\mu \ge 0\).

Solve the problem with solnp() in the following R environment or in your local RStudio.
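A possible solnp() setup for this exercise (a starting sketch only; the initial values and bounds are arbitrary choices, and the solution is left for you to inspect):

```r
library(Rsolnp)

# Negated profit (solnp minimises)
fn <- function(w) {
  x <- w[1]; y <- w[2]; z <- w[3]
  -(40 * x^0.5 * y^0.3 * z^0.2 - 0.4*x^2 - 0.3*y^2 - 0.5*z^2)
}

# Equality: labour hours 4x + 6y + 8z must equal 2400
eqf <- function(w) 4*w[1] + 6*w[2] + 8*w[3]

# Inequality: z <= 0.6(x+y+z), rewritten as 0.6(x+y+z) - z >= 0
inqf <- function(w) 0.6 * sum(w) - w[3]

res <- solnp(pars = c(100, 100, 100), fun = fn,
             eqfun = eqf, eqB = 2400,
             ineqfun = inqf, ineqLB = 0, ineqUB = 1e6,
             LB = c(1e-6, 1e-6, 1e-6), UB = c(1e6, 1e6, 1e6))
res$pars
```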

Interactive Quiz 5.3


Knowledge Point 4: Business Applications

Comprehensive Optimization Examples

Real business optimization problems often combine multiple concepts: unconstrained optimization for initial analysis, equality constraints for budget limitations, and inequality constraints for practical restrictions.

Example 4.1: Multi-Product Pricing Strategy

Problem Statement

A firm chooses production levels of two products, \(x\) and \(y\). Profit is given by

\[ \Pi(x,y) = 120x + 100y - 2x^2 - y^2 - xy . \]

This is an unconstrained maximisation problem.


Analytical Solution

First-Order Conditions (FOC)

\[ \frac{\partial \Pi}{\partial x} = 120 - 4x - y = 0, \quad \frac{\partial \Pi}{\partial y} = 100 - x - 2y = 0 \]

Solving:

\[ x^* = 20, \quad y^* = 40 \]

Second-Order Conditions (SOC)

The Hessian is

\[ H = \begin{bmatrix} -4 & -1 \\ -1 & -2 \end{bmatrix} \]

\(\det(H) = 7 > 0\) and \(H_{11} = -4 < 0\), so \(H\) is negative definite ⇒ local (and here global) maximum.


Numerical Solution in R

# Profit function
Pi <- function(v) {
  x <- v[1]; y <- v[2]
  120*x + 100*y - 2*x^2 - y^2 - x*y
}

# Negate because optim() minimizes
negPi <- function(v) -Pi(v)

# Gradient of -Pi
grad_negPi <- function(v) {
  x <- v[1]; y <- v[2]
  gx <- -(120 - 4*x - y)
  gy <- -(100 - x - 2*y)
  c(gx, gy)
}

start <- c(0, 0)
fit <- optim(
  par = start,
  fn = negPi,
  gr = grad_negPi,
  method = "BFGS",
  control = list(reltol = 1e-12)
)

fit$par       # x*, y*
## [1] 20.00000 39.99998
Pi(fit$par)   # Profit at optimum
## [1] 3200

Contour Plot

Interactive 3D Surface Plot

Example 4.2: Portfolio Selection with Two Assets

Problem Setup (component form)

We form a portfolio of two ASX stocks: CBA.AX and BHP.AX. Let weights be \(w_1\) (CBA) and \(w_2\) (BHP), with the budget constraint \(w_1 + w_2 = 1\). We maximize mean–variance utility with risk aversion \(\gamma>0\): \[ \max_{w_1,w_2}\; \hat{\mu}_1 w_1 + \hat{\mu}_2 w_2 -\frac{\gamma}{2}\Big[\hat{\sigma}_1^2 w_1^2 + 2\hat{\sigma}_{12} w_1 w_2 + \hat{\sigma}_2^2 w_2^2\Big]\]

\[ \text{s.t. } w_1+w_2=1.\]

Here \(\hat{\mu}_i\) are sample mean returns, and \[ \hat{\Sigma}= \begin{bmatrix} \hat{\sigma}_1^2 & \hat{\sigma}_{12}\\ \hat{\sigma}_{12} & \hat{\sigma}_2^2 \end{bmatrix} \quad\text{is the sample covariance matrix.} \]


Data: Monthly Prices (last 5 years) and Empirical Moments

Problem setup (data-driven two-asset portfolio)

We form a portfolio of two ASX stocks: CBA.AX (Commonwealth Bank) and BHP.AX (BHP Group).
Let weights be \(w_1\) (CBA) and \(w_2\) (BHP), with the budget constraint \(w_1 + w_2 = 1\).

We will estimate monthly log-return means and covariances from the last 5 years of data, then solve a mean–variance utility maximisation problem.


Data and empirical estimation

Let \(P_{i,t}\) be the adjusted closing price of asset \(i \in \{1,2\}\) at month \(t\) (with \(i=1\) for CBA, \(i=2\) for BHP). Define monthly log returns \[ r_{i t} \;=\; \ln\!\left(\frac{P_{i,t}}{P_{i,t-1}}\right), \qquad t=1,\dots,T. \]

Sample mean (monthly) \[ \hat{\mu} = \begin{bmatrix} \hat{\mu}_1\\[2pt] \hat{\mu}_2 \end{bmatrix} = \begin{bmatrix} \frac{1}{T}\sum_{t=1}^T r_{1t}\\[6pt] \frac{1}{T}\sum_{t=1}^T r_{2t} \end{bmatrix}. \]

Sample covariance (monthly) \[ \hat{\Sigma} = \begin{bmatrix} \hat{\sigma}_1^2 & \hat{\sigma}_{12}\\[2pt] \hat{\sigma}_{12} & \hat{\sigma}_2^2 \end{bmatrix}, \quad \hat{\sigma}_1^2=\frac{1}{T-1}\sum_{t=1}^T (r_{1t}-\hat{\mu}_1)^2,\; \hat{\sigma}_2^2=\frac{1}{T-1}\sum_{t=1}^T (r_{2t}-\hat{\mu}_2)^2,\; \hat{\sigma}_{12}=\frac{1}{T-1}\sum_{t=1}^T (r_{1t}-\hat{\mu}_1)(r_{2t}-\hat{\mu}_2). \]

(Optional annualisation: multiply means by \(12\) and standard deviations by \(\sqrt{12}\).)


Optimisation problem

Given risk aversion \(\gamma>0\), choose \((w_1,w_2)\) to maximise mean–variance utility: \[ \max_{w_1,w_2}\;\; \hat{\mu}_1 w_1 + \hat{\mu}_2 w_2 \;-\;\frac{\gamma}{2}\left( \hat{\sigma}_1^2 w_1^2 + 2\hat{\sigma}_{12} w_1 w_2 + \hat{\sigma}_2^2 w_2^2 \right) \quad\text{s.t.}\quad w_1 + w_2 = 1. \]

(No-shorting \(0\le w_i\le1\) can be added later; here we keep only the equality constraint.)


Analytical solution (equality constraint only)

Lagrangian \[ \mathcal{L}(w_1,w_2,\lambda) = \hat{\mu}_1 w_1 + \hat{\mu}_2 w_2 -\frac{\gamma}{2}\!\left(\hat{\sigma}_1^2 w_1^2 + 2\hat{\sigma}_{12} w_1 w_2 + \hat{\sigma}_2^2 w_2^2\right) +\lambda(1-w_1-w_2). \]

First-order conditions (FOC) \[ \frac{\partial \mathcal{L}}{\partial w_1}:\quad \hat{\mu}_1 - \gamma(\hat{\sigma}_1^2 w_1 + \hat{\sigma}_{12} w_2) - \lambda = 0, \] \[ \frac{\partial \mathcal{L}}{\partial w_2}:\quad \hat{\mu}_2 - \gamma(\hat{\sigma}_{12} w_1 + \hat{\sigma}_2^2 w_2) - \lambda = 0, \] \[ \frac{\partial \mathcal{L}}{\partial \lambda}:\quad w_1 + w_2 = 1. \]

Second-order condition (SOC)
The Hessian with respect to \((w_1,w_2)\) is \[ H = -\gamma \begin{bmatrix} \hat{\sigma}_1^2 & \hat{\sigma}_{12}\\[2pt] \hat{\sigma}_{12} & \hat{\sigma}_2^2 \end{bmatrix}. \] Since \(\hat{\Sigma}\succ 0\) and \(\gamma>0\), we have \(H\prec 0\) (negative definite), so the FOC point is a unique global maximiser.
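The positive-definiteness assumption is easy to verify numerically; a sketch using the sample covariance matrix estimated later in this section:

```r
# Sample covariance of monthly log returns (values from the estimation below)
Sigma <- matrix(c(3.592835e-03, 9.802353e-05,
                  9.802353e-05, 4.819121e-03), nrow = 2)

# All eigenvalues positive  =>  Sigma is positive definite,
# so H = -gamma * Sigma is negative definite for any gamma > 0
eigen(Sigma, symmetric = TRUE)$values
```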


Solving step-by-step

  1. Subtract the two FOCs to eliminate \(\lambda\): \[ \hat{\mu}_1 - \hat{\mu}_2 \;=\; \gamma\big[(\hat{\sigma}_1^2-\hat{\sigma}_{12})\,w_1 \;-\; (\hat{\sigma}_2^2-\hat{\sigma}_{12})\,w_2\big]. \]

  2. Substitute \(w_2 = 1 - w_1\) and solve for \(w_1\): \[ \boxed{ w_1^* = \frac{(\hat{\mu}_1 - \hat{\mu}_2)\;+\;\gamma(\hat{\sigma}_2^2 - \hat{\sigma}_{12})} {\gamma\big(\hat{\sigma}_1^2 + \hat{\sigma}_2^2 - 2\hat{\sigma}_{12}\big)} }, \qquad \boxed{\,w_2^* = 1 - w_1^*\, .} \]

  3. Substitute \(w_1^*\) and \(w_2^*\) back into either FOC to obtain the multiplier: \[ \boxed{ \lambda = \frac{ \big(\hat{\sigma}_2^2 - \hat{\sigma}_{12}\big)\hat{\mu}_1 \;+\; \big(\hat{\sigma}_1^2 - \hat{\sigma}_{12}\big)\hat{\mu}_2 \;-\; \gamma\big(\hat{\sigma}_1^2 \hat{\sigma}_2^2 - \hat{\sigma}_{12}^2\big) }{ \hat{\sigma}_1^2 + \hat{\sigma}_2^2 - 2\hat{\sigma}_{12} } } \]

The denominator equals \(\mathrm{Var}(r_{1}-r_{2})=\hat{\sigma}_1^2+\hat{\sigma}_2^2-2\hat{\sigma}_{12}\), which is positive unless the assets are perfectly linearly dependent.
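Plugging the sample moments reported below into the boxed formula gives the optimal weights directly; a sketch assuming \(\gamma = 2\):

```r
gamma <- 2
mu  <- c(0.01786453, 0.01022144)          # sample means (CBA, BHP)
s11 <- 3.592835e-03                       # Var(r_CBA)
s22 <- 4.819121e-03                       # Var(r_BHP)
s12 <- 9.802353e-05                       # Cov(r_CBA, r_BHP)

D  <- s11 + s22 - 2 * s12                 # Var(r_1 - r_2), the denominator
w1 <- ((mu[1] - mu[2]) + gamma * (s22 - s12)) / (gamma * D)
w2 <- 1 - w1
c(w1 = w1, w2 = w2)                       # w1 slightly above 1: a small short in BHP
```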

R code for empirical data extraction

# install.packages(c("tidyquant","dplyr","tidyr","lubridate"), dependencies = TRUE)
library(tidyquant)
library(dplyr)
library(tidyr)
library(lubridate)

symbols <- c("CBA.AX", "BHP.AX")
start   <- floor_date(Sys.Date() - years(5), unit = "month")
end     <- Sys.Date()

# 1) Download daily prices, then compute monthly log returns from adjusted prices
prices <- tq_get(symbols, from = start, to = end, get = "stock.prices")

monthly_ret <- prices %>%
  group_by(symbol) %>%
  tq_transmute(
    select     = adjusted,
    mutate_fun = periodReturn,
    period     = "monthly",
    type       = "log",
    col_rename = "ret"
  ) %>%
  ungroup()

# 2) Wide matrix of returns: rows = months, columns = symbols
R <- monthly_ret %>%
  select(symbol, date, ret) %>%
  pivot_wider(names_from = symbol, values_from = ret) %>%
  arrange(date) %>%
  drop_na()

# 3) Empirical moments (monthly log returns)
r_mat <- as.matrix(R[, symbols])
mu    <- colMeans(r_mat)   # vector c(mu_CBA, mu_BHP)
Sigma <- cov(r_mat)        # 2x2 covariance matrix

# Print results
mu
##     CBA.AX     BHP.AX 
## 0.01786453 0.01022144
Sigma
##              CBA.AX       BHP.AX
## CBA.AX 3.592835e-03 9.802353e-05
## BHP.AX 9.802353e-05 4.819121e-03

R code for constrained optimization

library(Rsolnp)
# Two-asset portfolio optimisation with an equality constraint
mu <- c(0.08, 0.12)  # Expected returns (illustrative values)
Sigma <- matrix(c(0.04, 0.01,
                  0.01, 0.09), nrow = 2)  # Covariance matrix

n <- 2

### negative mean–variance utility for gamma = 2
fn1 <- function(w) {
  -t(mu) %*% w + t(w) %*% Sigma %*% w
}

### the constraint that the weights sum to 1
eqn1 <- function(w) {
  sum(w)
}

### the initial value for the numerical procedure
ini <- c(0.5, 0.5)
solution <- solnp(ini, fun = fn1, eqfun = eqn1, eqB = 1,
                  LB = rep(0, n), UB = rep(1, n))
## 
## Iter: 1 fn: -0.06273  Pars:  0.54545 0.45455
## Iter: 2 fn: -0.06273  Pars:  0.54545 0.45455
## solnp--> Completed in 2 iterations
### the optimal values
solution$par
## [1] 0.5454546 0.4545454
fn1(solution$par)
##             [,1]
## [1,] -0.06272727
sum(solution$par)
## [1] 1
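
As a consistency check, the closed-form weight from the analytical section reproduces the solnp result for these illustrative inputs:

```r
gamma <- 2
mu  <- c(0.08, 0.12)
s11 <- 0.04; s22 <- 0.09; s12 <- 0.01

w1 <- ((mu[1] - mu[2]) + gamma * (s22 - s12)) / (gamma * (s11 + s22 - 2 * s12))
c(w1, 1 - w1)   # ~0.5455 and ~0.4545, matching solution$par
```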

Interactive Quiz 5.4


Knowledge Point 5: Solution Methods in R

Computational Tools for Optimization

R provides several powerful tools for solving optimization problems. Understanding when and how to use each tool is crucial for practical problem-solving.

R Optimization Functions Overview

1. optim() - General-purpose optimization

  • Best for: Unconstrained problems, simple constraints
  • Methods: BFGS, L-BFGS-B, Nelder-Mead, CG
  • Advantages: Flexible, handles multiple variables
  • Limitations: No general equality/inequality constraints (only box bounds via "L-BFGS-B")

2. constrOptim() - Constrained optimization

  • Best for: Linear inequality constraints
  • Method: Barrier method with penalty functions
  • Advantages: Handles constraints directly
  • Limitations: Only inequality constraints, can be slow

3. lpSolve::lp() - Linear programming

  • Best for: Linear objective and constraints
  • Advantages: Very fast, guaranteed global optimum
  • Limitations: Only linear problems

4. quadprog::solve.QP() - Quadratic programming

  • Best for: Quadratic objective, linear constraints
  • Advantages: Efficient for portfolio optimization
  • Limitations: Requires specific problem structure
| Function / Package | Best For | Constraints Supported | Advantages | Limitations |
|---|---|---|---|---|
| optim() | Unconstrained problems or simple box constraints | None; box ("L-BFGS-B") | Flexible, built-in, multiple methods (BFGS, CG, Nelder-Mead, etc.) | No general equality/inequality constraints |
| constrOptim() | Problems with linear inequality constraints | Linear inequality | Direct constraint handling, built into R | No equality constraints; slower for large problems |
| lpSolve::lp() | Linear objective with linear constraints | Linear equality & inequality | Very fast, guaranteed global optimum | Only linear problems |
| quadprog::solve.QP() | Quadratic objective with linear constraints | Linear equality & inequality | Efficient for mean–variance portfolio optimisation | Requires positive-definite quadratic term, linear constraints only |
| Rsolnp::solnp() | Nonlinear objective with equality/inequality constraints | Nonlinear equality & inequality + bounds | Handles general nonlinear constraints; supports bounds | Needs good starting values; may converge to local optima |
| nloptr::nloptr() | Large/complex nonlinear optimisation | Nonlinear equality & inequality + bounds | Many algorithms (global & local), gradient-free or gradient-based | Algorithm choice/tuning is critical |
| DEoptim::DEoptim() | Global optimisation, non-smooth or multi-modal problems | Bounds only | Robust to local minima; no gradient required | Slower for smooth problems; needs tuning of population size & iterations |
| GenSA::GenSA() | Global optimisation over complex search spaces | Bounds only | Works for discrete or rugged landscapes; automatic annealing schedule | Can be slow; stochastic results vary |
| optimx::optimx() | Comparing multiple local optimisation methods | Same as the optim() methods used | Simple interface to try several methods and compare results | Constraint support limited to the methods used |
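
As a sketch of the quadprog route, the same two-asset problem fits solve.QP's template \(\min_w \tfrac12 w^\top D w - d^\top w\) subject to \(A^\top w \ge b\) (the first meq constraints treated as equalities), with \(D = \gamma\hat\Sigma\) and \(d = \hat\mu\); illustrative inputs as above:

```r
library(quadprog)

gamma <- 2
mu    <- c(0.08, 0.12)
Sigma <- matrix(c(0.04, 0.01,
                  0.01, 0.09), nrow = 2)

# Maximising mu'w - (gamma/2) w'Sigma w  <=>  minimising (1/2) w'(gamma*Sigma)w - mu'w
Dmat <- gamma * Sigma
dvec <- mu
Amat <- matrix(1, nrow = 2, ncol = 1)   # single constraint: w1 + w2 = 1
bvec <- 1

sol <- solve.QP(Dmat, dvec, Amat, bvec, meq = 1)
sol$solution                             # optimal weights
```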

Best Practices

  1. Start simple: Use analytical solutions when possible
  2. Check feasibility: Ensure initial points satisfy constraints
  3. Verify results: Compare multiple methods
  4. Scale variables: Normalize for numerical stability
  5. Set appropriate tolerances: Balance accuracy and computation time

Interactive Quiz 5.5


Comprehensive Review Questions

This section provides review and application questions designed to assess understanding at five different levels. Each level builds upon the previous one, fostering deeper comprehension and the ability to apply concepts in increasingly complex situations.

https://pchen-exercisesfeedbackselect.hf.space


Summary and Key Takeaways

Core Concepts Mastered

  1. Unconstrained Optimization: FOC and SOC provide necessary and sufficient conditions for finding extrema
  2. Lagrange Multipliers: Enable optimization with equality constraints, with multipliers representing shadow prices
  3. KKT Conditions: Extend optimization to inequality constraints, crucial for real-world business problems
  4. Business Applications: Optimization appears in pricing, production, portfolio management, and resource allocation
  5. Computational Methods: R provides multiple tools, each suited for different problem types

Business Relevance

Optimization is fundamental to business decision-making:

  • Strategic Planning: Resource allocation and capacity planning
  • Financial Management: Portfolio optimization and risk management
  • Operations: Production scheduling and supply chain optimization
  • Marketing: Pricing strategies and budget allocation

Mathematical Foundation

The progression from unconstrained to constrained optimization mirrors the evolution from theoretical models to practical business applications. Understanding both the mathematical theory and computational implementation enables effective problem-solving in complex business environments.


This concludes Section 5: Optimization. Students should now be equipped to tackle real-world optimization problems using both analytical and computational approaches.